Alternating Optimisation and Quadrature for Robust Control
Authors
Abstract
Bayesian optimisation has been successfully applied to a variety of reinforcement learning problems. However, the traditional approach for learning optimal policies in simulators does not utilise the opportunity to improve learning by adjusting certain environment variables: state features that are unobservable and randomly determined by the environment in a physical setting but are controllable in a simulator. This paper considers the problem of finding a robust policy while taking into account the impact of environment variables. We present Alternating Optimisation and Quadrature (ALOQ), which uses Bayesian optimisation and Bayesian quadrature to address such settings. ALOQ is robust to the presence of significant rare events, which may not be observable under random sampling, but play a substantial role in determining the optimal policy. Experimental results across different domains show that ALOQ can learn more efficiently and robustly than existing methods.
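The role of rare events can be illustrated with a toy calculation. The sketch below is not ALOQ: it replaces Bayesian optimisation and Bayesian quadrature with an exhaustive search over two policies and an exact weighted sum over two environment settings. The function `simulated_return`, the setting names and all numbers are hypothetical, chosen only to show how a policy that looks best under the nominal environment can be worse once a rare setting is marginalised out.

```python
# Toy illustration only: every name and number here is an assumption,
# not taken from the paper. Policy 0 is "cautious", policy 1 is "risky".

def simulated_return(policy, env):
    """Hypothetical simulator: return of a policy under one environment setting."""
    if env == "nominal":
        return 1.0 if policy == 0 else 1.5
    # Rare setting: the risky policy fails badly, the cautious one barely changes.
    return 0.9 if policy == 0 else -50.0

# Assumed distribution over the environment variable (controllable in a simulator).
env_probs = {"nominal": 0.98, "rare": 0.02}

def expected_return(policy):
    """Exact marginalisation over the environment variable (a plain weighted sum)."""
    return sum(p * simulated_return(policy, env) for env, p in env_probs.items())

policies = [0, 1]

# Optimising against the nominal setting alone ignores the rare event.
best_nominal = max(policies, key=lambda pi: simulated_return(pi, "nominal"))

# Optimising the marginalised objective accounts for it.
best_robust = max(policies, key=expected_return)

print("best under nominal setting only:", best_nominal)
print("best under marginalised return:", best_robust)
```

Here the risky policy wins under the nominal setting, but its expected return (about 0.47) falls below that of the cautious policy (about 0.998) once the 2% rare setting is included. With only a handful of random environment samples, the rare setting would often not be drawn at all, which is the situation the abstract describes.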
Similar resources
Alternating Optimisation and Quadrature for Robust Reinforcement Learning
Bayesian optimisation has been successfully applied to a variety of reinforcement learning problems. However, the traditional approach for learning optimal policies in simulators does not utilise the opportunity to improve learning by adjusting certain environment variables – state features that are randomly determined by the environment in a physical setting but are controllable in a simulator...
Emergency department resource optimisation for improved performance: a review
Emergency departments (EDs) have become increasingly congested due to the combined impacts of growing demand, access block and the increased clinical capability of EDs. This congestion is known to have adverse impacts on the performance of healthcare services. Attempts to overcome this challenge have focussed largely on demand management and the application of system wide p...
Robust Control of Room Temperature and Relative Humidity Using Advanced Nonlinear Inverse Dynamics and Evolutionary Optimisation
A robust controller is developed, using advanced nonlinear inverse dynamics (NID) controller design and genetic algorithm optimisation, for room temperature control. The performance is evaluated through application to a single zone dynamic building model. The proposed controller produces superior performance when compared to the NID controller optimised with a simple optimisation algorithm, and...
A multi-objective optimisation-based software environment for control systems design
Multi-objective optimisation is a proven, well-known parameter tuning technique in control design. It is especially suited to solve complex, multi-disciplinary design problems. This paper describes a software environment, called MOPS (Multi-Objective Parameter Synthesis), which supports the control engineer in setting up his design problem as a properly formulated multi-objective optimisation t...
Estimation de la réflectance à partir de données multi-vues
We introduce a variational framework for separating shading and reflectance from a series of images acquired under different angles, when the geometry has already been estimated by multi-view stereo. Our formulation uses an l-TV variational framework, where a robust photometric-based data term enforces consistency with the images, total variation ensures piecewise-smoothness of the reflectance, and...